Human Development, Gender Equality, and Suicide#
Student names: Mitch Boontjes, Lloyd de Rouw, Julian
Team number: J4
Show code cell source
# Load image from link
url = 'https://i0.wp.com/epthinktank.eu/wp-content/uploads/2022/06/AdobeStock_456540956.jpeg?fit=4865%2C3000&ssl=1'
# Display image from URL with smaller size and subtitle
from IPython.display import Image, display
# Set the desired image width and height
width = 600
height = 300
# Set the subtitle text
subtitle = "© European Parliamentary Research Service"
# Create an Image instance with the URL
image = Image(url=url, width=width, height=height)
# Display the image and subtitle
display(image)
print(subtitle)

© European Parliamentary Research Service
Introduction#
This data story examines how human development and gender equality relate to suicide rates across countries.
Dataset and Preprocessing#
Dataset 1#
Database 1 contains data on 185 countries, centered on each country's Human Development Index (HDI) as measured in 2015. The HDI is a value between 0 (low) and 1 (high) that indicates human development based on health, education, and standard of living. The database contains every variable that feeds into the HDI, but we are only interested in the HDI values themselves. As preprocessing, all columns except the country and its HDI value are therefore removed. https://www.kaggle.com/datasets/undp/human-development?select=human_development.csv
Database 2 has the same data context as the first one, but it covers the Gender Development Index (GDI) instead of the HDI. The GDI is also a value between 0 and 1 (low-high) and indicates how equal human development is between men and women. To clarify, the higher the gender equality, the higher the GDI value. As with database 1, all columns except the country and its GDI value are removed. https://www.kaggle.com/datasets/undp/human-development?select=human_development.csv
Database 3 contains the suicide rate per 100,000 people for 182 countries from 2000 through 2019. We keep only the year 2015, because databases 1 and 2 were measured in that year. Databases 1, 2, and 3 are then merged into Dataset 1, and the columns are renamed to more practical names. Rows with one or more empty values are removed from the dataset. Dataset 1 is used for the Development Perspective. https://www.kaggle.com/datasets/sandragracenelson/suicide-rate-of-countries-per-every-year
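The steps described above (select columns, filter to 2015, merge, rename, drop incomplete rows) can be sketched in pandas as follows. This is a minimal illustration, not the notebook's actual preprocessing code: the toy frames, their values, and the raw column names are made up, and `GDI 2015` is an assumed name chosen to match the `HDI 2015` and `Average suicide 2015` columns used later.

```python
import pandas as pd

# Hypothetical stand-ins for Databases 1-3 (values and raw column names are invented).
development = pd.DataFrame({
    "Country": ["Country A", "Country B", "Country C"],
    "Human Development Index (HDI)": [0.90, 0.45, 0.70],
    "Gender Development Index (GDI)": [0.98, 0.80, None],  # Country C has no GDI
    "Life Expectancy at Birth": [81.0, 58.0, 72.0],          # not needed, dropped below
})
suicide = pd.DataFrame({
    "Country": ["Country A", "Country A", "Country B", "Country C"],
    "Year": [2014, 2015, 2015, 2015],
    "Rate": [10.0, 11.0, 7.0, 9.0],
})

# Keep only the country and its index values.
development = development[
    ["Country", "Human Development Index (HDI)", "Gender Development Index (GDI)"]
]

# Keep only the 2015 suicide rates.
suicide_2015 = suicide.loc[suicide["Year"] == 2015, ["Country", "Rate"]]

# Merge on country, rename to practical names, and drop rows with empty values.
dataset_1 = (
    development.merge(suicide_2015, on="Country", how="inner")
    .rename(columns={
        "Human Development Index (HDI)": "HDI 2015",
        "Gender Development Index (GDI)": "GDI 2015",
        "Rate": "Average suicide 2015",
    })
    .dropna()
    .reset_index(drop=True)
)
print(dataset_1)
```

In this toy example Country C is removed because its GDI is missing, which mirrors how countries with incomplete data drop out of Dataset 1.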
Dataset 2&3#
Database 4 contains suicide rates for 183 countries for four years (2000, 2010, 2015, 2016). Each country accounts for three rows of data: one for male suicides, one for female suicides, and one for both combined. This database is first merged with the HDI and GDI statistics from Dataset 1. It is then separated into a female suicide dataset (Dataset 2) and a male suicide dataset (Dataset 3). To do this, the labels ‘female’ and ‘male’ first had to be changed to FeM and Male (using str.contains() and str.replace()), because the labels in database 4 appear to contain extra whitespace, which made it impossible to filter directly on ‘female’ and ‘male’. Datasets 2&3 are used exclusively for the Gender Perspective. https://www.kaggle.com/datasets/twinkle0705/mental-health-and-suicide-rates?select=Age-standardized+suicide+rates.csv
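An alternative to relabelling is to normalise the labels first and then compare them exactly. The sketch below assumes a `Sex` column with leading whitespace and a `2015` rate column; both names and all values are hypothetical. It also shows a second pitfall that may have played a role: the string "female" contains "male", so a substring match on "male" selects both groups.

```python
import pandas as pd

# Hypothetical excerpt of Database 4; note the leading spaces in the labels.
rates = pd.DataFrame({
    "Country": ["A", "A", "A", "B", "B", "B"],
    "Sex": [" Male", " Female", " Both sexes", " Male", " Female", " Both sexes"],
    "2015": [12.0, 4.0, 8.0, 20.0, 6.0, 13.0],
})

# Strip surrounding whitespace and lowercase the labels once.
rates["Sex"] = rates["Sex"].str.strip().str.lower()

# Substring matching is ambiguous: "female" also contains "male".
ambiguous_matches = rates["Sex"].str.contains("male").sum()

# Exact comparison on the cleaned labels separates the groups cleanly.
female = rates[rates["Sex"] == "female"].reset_index(drop=True)
male = rates[rates["Sex"] == "male"].reset_index(drop=True)

print(ambiguous_matches)
print(female)
print(male)
```

Exact matching after `str.strip()` avoids introducing temporary labels, at the cost of assuming that whitespace and letter case are the only differences between labels.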
Dataset 4#
Database 5 is a large dataset that tracks 200 countries and their reign information (41 columns) for each available year up to 2021. As we are interested in statistics related to human development, we select the variables government type, political violence, and the democracy boolean (1.0 for democracy, 0.0 for non-democracy). Seven countries from database 5 are renamed so that they can be merged correctly with Dataset 1. For each country, the most recent data point (by year) is selected, and the year is then dropped from the dataset. Finally, some columns are renamed for clarity and so that they can be merged with Dataset 1 into the Main Dataset. https://www.kaggle.com/datasets/janzasadny/rulers-elections-and-irregular-governance
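Selecting the most recent data point per country can be done by sorting on year and keeping the last row of each country. The following is a minimal sketch under assumed column names (`country`, `year`, `government`, `democracy`) and invented values; the real database has 41 columns and may name them differently.

```python
import pandas as pd

# Hypothetical excerpt of Database 5: several yearly rows per country.
reign = pd.DataFrame({
    "country": ["A", "A", "B", "B", "B"],
    "year": [2019, 2021, 2018, 2020, 2019],
    "government": ["Type 1", "Type 1", "Type 2", "Type 3", "Type 2"],
    "democracy": [1.0, 1.0, 0.0, 0.0, 1.0],
})

latest = (
    reign.sort_values("year")                        # oldest first, newest last
    .drop_duplicates(subset="country", keep="last")  # keep the newest row per country
    .drop(columns="year")                            # the year is no longer needed
    .sort_values("country")
    .reset_index(drop=True)
)
print(latest)
```

Note that the most recent row can come from a different year for each country, so the political variables are not guaranteed to describe 2015.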
Main Dataset#
The Main Dataset merges Datasets 1 and 4, which leaves a total of 140 countries with their HDI, GDI, average suicide rate for 2015, democracy status, government type, and political violence. This dataset is used for the Political Perspective.
Note that we refer to the unedited data sources as ‘Databases 1-5’ and to the edited datasets as ‘Datasets 1-4’ and the ‘Main Dataset’.
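Because the merge is done on country names, a country is lost whenever its name is spelled differently in the two sources. A quick way to see which names fail to match is an outer merge with `indicator=True`. The sketch below uses invented frames and an assumed shared key column called `Country`.

```python
import pandas as pd

# Hypothetical stand-ins for Dataset 1 and Dataset 4.
dataset_1 = pd.DataFrame({"Country": ["A", "B", "C"], "HDI 2015": [0.9, 0.5, 0.7]})
dataset_4 = pd.DataFrame({"Country": ["A", "B", "D"], "democracy": [1.0, 0.0, 1.0]})

# Outer merge with an indicator column shows where each row came from.
check = dataset_1.merge(dataset_4, on="Country", how="outer", indicator=True)
unmatched = check.loc[check["_merge"] != "both", ["Country", "_merge"]]
print(unmatched)  # candidates for renaming before the final merge

# The final dataset keeps only countries present in both sources.
main_dataset = dataset_1.merge(dataset_4, on="Country", how="inner")
print(main_dataset)
```

Inspecting `unmatched` before the inner merge helps distinguish countries that are truly absent from one source from those that only differ in spelling.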
Imports & Installs#
Show code cell source
!pip install -U numpy
!pip install -U plotly
!pip install -U pandas
!pip install -U matplotlib
!pip install -U seaborn
!pip install -U geopandas
!pip install -U ipywidgets
This data story will explore three different perspectives.
The first perspective is that the HDI and GDI influence the average suicide rate per country.
The second perspective looks specifically at suicide rates per gender.
The third perspective looks at the geographical distribution of suicide rates based on GDI and HDI levels.
Your First Perspective#
The first perspective is that the HDI and GDI have influence on the average suicides per country
The First Argument of Your First Perspective#
For this part, we look at the relationship between the Human Development Index (HDI) of a country and its average suicide rate. The HDI is an index used to measure the overall development and well-being of countries. It takes into account factors such as education, life expectancy, and income to give an overall idea of a country’s level of development. The average suicide rate refers to how many people per 100,000 die by suicide in a year, in this case 2015.
Show code cell source
import pandas as pd
import numpy as np
import plotly.graph_objects as go
df = pd.read_csv('databases/IV DATASET 1.csv')
hdi = df['HDI 2015']
average_suicide = df['Average suicide 2015']
# Convert to NumPy arrays
hdi = np.array(hdi)
average_suicide = np.array(average_suicide)
# Exclude NaN and inf values
valid_indices = np.isfinite(hdi) & np.isfinite(average_suicide)
hdi = hdi[valid_indices]
average_suicide = average_suicide[valid_indices]
# Calculate mean values
mean_hdi = np.mean(hdi)
mean_average_suicide = np.mean(average_suicide)
# Calculate the least squares regression line
A = np.vstack([hdi, np.ones(len(hdi))]).T
m, c = np.linalg.lstsq(A, average_suicide, rcond=None)[0]
# Create scatter plot
fig = go.Figure(data=go.Scatter(x=hdi, y=average_suicide, mode='markers'))
# Add the least squares trendline
fig.add_trace(go.Scatter(x=hdi, y=m * hdi + c, mode='lines', name='Least Squares Trendline'))
# Set labels and title
fig.update_layout(
    xaxis_title='HDI',
    yaxis_title='Average Suicide',
    title='HDI vs. Average Suicide'
)
# Show the plot
fig.show()
*Figure 1: Human Development Index (2015) on the x-axis and average suicides per 100,000 people (2015) on the y-axis. Each blue dot represents a country. The least-squares trendline shows a small negative correlation between the HDI and the average suicide rate.
In the graph above, there is a negative relationship between the HDI and the suicide rate: the higher the HDI, the lower the suicide rate tends to be. A plausible explanation is that a higher HDI goes together with better mental health awareness and support systems. Countries with a higher HDI tend to treat mental health as an essential part of society and invest in initiatives such as public education campaigns, accessible mental healthcare services, and community support networks. With these in place, the suicide rate often decreases, because people can find help before they make the fatal decision to end their lives.
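A trendline alone does not say how strong the relationship is. One way to quantify it is the Pearson correlation coefficient, which can be computed next to the least-squares slope. The sketch below uses a handful of invented values rather than the real dataset, so the printed numbers say nothing about the actual countries; with the notebook's data, the `hdi` and `average_suicide` arrays from the cell above could be passed in instead.

```python
import numpy as np

# Invented example values, not taken from the dataset.
hdi = np.array([0.40, 0.55, 0.70, 0.85, 0.95])
rate = np.array([12.0, 9.0, 11.0, 8.0, 10.0])

# Pearson correlation coefficient between the two variables.
r = np.corrcoef(hdi, rate)[0, 1]

# Least-squares line, equivalent to the lstsq fit used for the trendline.
slope, intercept = np.polyfit(hdi, rate, deg=1)

print(f"Pearson r = {r:.3f}, r^2 = {r**2:.3f}")
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The squared coefficient `r**2` is the share of variance in the suicide rate that the linear fit accounts for, which helps judge whether a "small" correlation is meaningful. A correlation on its own does not establish that one variable causes the other.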
Show code cell source
import geopandas as gpd
import pandas as pd
import plotly.express as px
from ipywidgets import interact
# Step 3: Load the shapefile or GeoJSON file
shapefile_path = 'countries_map/countries.shp'
shapefile_data = gpd.read_file(shapefile_path)
# Step 4: Load the CSV data
csv_file_path = 'databases/IV DATASET 1.csv'
csv_data = pd.read_csv(csv_file_path)
# Rename countries in the shapefile data so that they match the names in our .csv file
shapefile_data.loc[shapefile_data['NAME'] == 'United States of America', 'NAME'] = 'United States'
shapefile_data.loc[shapefile_data['NAME'] == 'Russia', 'NAME'] = 'Russian Federation'
shapefile_data.loc[shapefile_data['NAME'] == 'Dem. Rep. Congo', 'NAME'] = 'Congo (Democratic Republic of the)'
shapefile_data.loc[shapefile_data['NAME'] == 'Iran', 'NAME'] = 'Iran (Islamic Republic of)'
shapefile_data.loc[shapefile_data['NAME'] == 'Tanzania', 'NAME'] = 'Tanzania (United Republic of)'
shapefile_data.loc[shapefile_data['NAME'] == 'South Korea', 'NAME'] = 'Korea (Republic of)'
shapefile_data.loc[shapefile_data['NAME'] == 'Venezuela', 'NAME'] = 'Venezuela (Bolivarian Republic of)'
shapefile_data.loc[shapefile_data['NAME'] == 'Bolivia', 'NAME'] = 'Bolivia (Plurinational State of)'
shapefile_data.loc[shapefile_data['NAME'] == 'Laos', 'NAME'] = "Lao People's Democratic Republic"
# Step 5: Merge shapefile data with CSV data using country names
merged_data = shapefile_data.merge(csv_data, left_on='NAME', right_on='Country', how='left')
# Step 6: Remove outliers with IQR-style bounds (the upper quantile used below is the 90th percentile, not the usual 75th)
Q1 = merged_data['Average suicide 2015'].quantile(0.25)
Q3 = merged_data['Average suicide 2015'].quantile(0.90)
IQR = Q3 - Q1
lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
filtered_data = merged_data[
    (merged_data['Average suicide 2015'] >= lower_bound) &
    (merged_data['Average suicide 2015'] <= upper_bound)
]
# Define the function to update the map based on the selected index
def update_map(index):
    if index == 0:
        column_name = 'HDI 2015'
        title = 'HDI 2015 Worldwide'
    elif index == 1:
        column_name = 'Average suicide 2015'
        title = 'Average Suicides/100k Worldwide, 2015'
    fig = px.choropleth(
        filtered_data,
        geojson=filtered_data.geometry,
        locations=filtered_data.index,
        color=column_name,
        color_continuous_scale='YlOrRd',
        range_color=(filtered_data[column_name].min(), filtered_data[column_name].max()),
        projection="natural earth"
    )
    fig.update_layout(
        title=title,
        coloraxis_colorbar=dict(
            title=column_name,
            len=0.8,
            thickness=20,
            ypad=0,
            yanchor="top",
            ticks="outside",
            tickvals=[filtered_data[column_name].min(), filtered_data[column_name].max()],
            ticktext=[str(filtered_data[column_name].min()), str(filtered_data[column_name].max())]
        ),
        geo=dict(
            showframe=False,
            showcoastlines=False,
            projection_type="natural earth"
        )
    )
    fig.show()
# Use the interact function to create the interactive widget
interact(update_map, index=[('HDI', 0), ('Average Suicides/100k', 1)])
*Figure 2: World map showing the HDI and the average suicide rate of countries with available data. It is possible to switch between an HDI map and a suicide rate map. The darker the shade of red, the higher the country's HDI or suicide rate.
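The map cell above removes extreme suicide rates using quantile-based bounds before plotting. For comparison, the sketch below applies the conventional IQR rule, which takes the 25th and 75th percentiles, to a handful of made-up rates. The `rates` values are hypothetical and only serve to show how the bounds are derived.

```python
import pandas as pd

# Made-up suicide rates per 100k (hypothetical values, for illustration only)
rates = pd.Series([4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 60.0], name="rate")

q1 = rates.quantile(0.25)   # 25th percentile
q3 = rates.quantile(0.75)   # 75th percentile
iqr = q3 - q1               # interquartile range

lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

# Keep only the values inside [lower_bound, upper_bound] (bounds included)
kept = rates[rates.between(lower_bound, upper_bound)]
print(kept.tolist())
```

Choosing a higher upper quantile widens the bounds, so fewer countries are dropped from the map.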
On the world map, you can see that, in general, countries with a higher HDI tend to have a lower suicide rate. This can be explained by looking at the education and economic opportunities in a country. The darker areas on the HDI world map, such as Western Europe, North America and Oceania, are also known for their high level of education and strong economic opportunities. A higher HDI often indicates better education and economic opportunities, enabling people to set meaningful objectives and goals in their lives, so that they tend to have fewer suicidal thoughts. In the countries that are lighter on the HDI map, such as most of Africa, the Middle East and Latin America, people have fewer economic opportunities and less access to education. This makes it harder for them to set goals, which can make it harder to find meaning in their lives.
Overall, the relationship between HDI and suicide rates is complex. There are many other factors that contribute to this relationship, such as cultural or social aspects.
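Because the map is built by joining two tables on country names, any name that is spelled differently in the shapefile and in the CSV silently ends up without data. One way to spot such mismatches is the `indicator=True` option of pandas' `merge`. The two frames below are tiny made-up stand-ins for `shapefile_data` and `csv_data`, and the HDI numbers are hypothetical.

```python
import pandas as pd

# Tiny made-up stand-ins for the shapefile table and the CSV table
shapes = pd.DataFrame({"NAME": ["Norway", "Russia", "Laos"]})
values = pd.DataFrame({
    "Country": ["Norway", "Russian Federation"],
    "HDI 2015": [0.9, 0.8],  # hypothetical numbers
})

# indicator=True adds a `_merge` column: 'both', 'left_only' or 'right_only'
check = shapes.merge(values, left_on="NAME", right_on="Country",
                     how="left", indicator=True)

# Names present in the shapefile but without a match in the CSV
unmatched = check.loc[check["_merge"] == "left_only", "NAME"].tolist()
print(unmatched)
```

Every name in `unmatched` either needs a rename rule like the ones in the map cell or has no data in the CSV at all.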
GENDER PERSPECTIVE#
This part of the data story looks at the relationship between the Gender Development Index (GDI) and the HDI on the one hand, and the male and female suicide rates on the other. In particular, it looks at whether an increase in the HDI and/or GDI has a different effect for each gender. The GDI is an index of the overall development of men and women in a country. It helps us see whether both genders have equal opportunities and equal access to the components of the HDI, such as education, healthcare and economic opportunities. It is important to note that the GDI does not measure the same thing as gender inequality. Whereas the GDI focuses on equality of access and opportunities for both genders, gender inequality focuses on the unequal distribution of power, resources and opportunities based on gender, which can include unequal pay, underrepresentation in leadership roles or gender-based violence. The male and female suicide rates represent the average number of suicides for that gender per 100,000 people of that gender.
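To make the "per 100,000 people" unit concrete, the sketch below computes a crude rate from a death count and a population size. `rate_per_100k` is a hypothetical helper and the figures are invented. Published suicide rates are often age-standardised, which this sketch does not attempt to reproduce.

```python
def rate_per_100k(deaths, population):
    """Crude rate: number of deaths per 100,000 people in the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * 100_000 / population

# Hypothetical figures, chosen only to make the arithmetic easy to follow
male_rate = rate_per_100k(deaths=150, population=1_000_000)
female_rate = rate_per_100k(deaths=50, population=1_250_000)
print(male_rate, female_rate)
```

Expressing both genders per 100,000 people of that same gender makes the two rates comparable even when the male and female populations differ in size.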
Show code cell source
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
df = pd.read_csv('databases/DATABASE 4.csv')
df1 = pd.read_csv('databases/IV DATASET 1.csv')
df1 = df1[['Country', 'GDI 2015', 'HDI 2015']]
df = pd.merge(df, df1, on='Country')
# Normalise the labels (trim whitespace, lower-case) and match them exactly,
# because a substring test for 'male' would also match 'female'
df['Sex'] = df['Sex'].str.strip().str.lower()
female_suicides = df[df['Sex'] == 'female']
male_suicides = df[df['Sex'] == 'male']
female_suicides_2015 = female_suicides[['Country', '2015', 'GDI 2015', 'HDI 2015']]
male_suicides_2015 = male_suicides[['Country', '2015', 'GDI 2015', 'HDI 2015']]
# Merge female and male datasets on Country
merged_data = pd.merge(female_suicides_2015, male_suicides_2015, on='Country', suffixes=('_Female', '_Male'))
# Select the relevant columns for HDI, GDI, and suicide rates
data = merged_data[['Country', 'HDI 2015_Female', 'HDI 2015_Male', 'GDI 2015_Female', 'GDI 2015_Male', '2015_Female', '2015_Male']]
# Calculate trendlines for HDI (female and male)
z_hdi_female = np.polyfit(data['HDI 2015_Female'], data['2015_Female'], 1)
p_hdi_female = np.poly1d(z_hdi_female)
z_hdi_male = np.polyfit(data['HDI 2015_Male'], data['2015_Male'], 1)
p_hdi_male = np.poly1d(z_hdi_male)
# Calculate trendlines for GDI (female and male)
z_gdi_female = np.polyfit(data['GDI 2015_Female'], data['2015_Female'], 1)
p_gdi_female = np.poly1d(z_gdi_female)
z_gdi_male = np.polyfit(data['GDI 2015_Male'], data['2015_Male'], 1)
p_gdi_male = np.poly1d(z_gdi_male)
# Create plotly scatter plots for GDI and HDI
fig = make_subplots(rows=1, cols=2, subplot_titles=['Relationship between GDI and Suicide Rate', 'Relationship between HDI and Suicide Rate'])
# Scatter plot for GDI
fig.add_trace(go.Scatter(
    x=data['GDI 2015_Female'],
    y=data['2015_Female'],
    mode='markers',
    marker=dict(color='red'),
    name='Female'
), row=1, col=1)
fig.add_trace(go.Scatter(
    x=data['GDI 2015_Male'],
    y=data['2015_Male'],
    mode='markers',
    marker=dict(color='blue'),
    name='Male'
), row=1, col=1)
# Trendline for GDI
gdi_range = np.linspace(data['GDI 2015_Female'].min(), data['GDI 2015_Female'].max(), 100)
fig.add_trace(go.Scatter(
    x=gdi_range,
    y=p_gdi_female(gdi_range),
    mode='lines',
    line=dict(color='red', dash='dash'),
    name='Female Trend'
), row=1, col=1)
fig.add_trace(go.Scatter(
    x=gdi_range,
    y=p_gdi_male(gdi_range),
    mode='lines',
    line=dict(color='blue', dash='dash'),
    name='Male Trend'
), row=1, col=1)
fig.update_xaxes(title_text='GDI 2015', row=1, col=1)
fig.update_yaxes(title_text='Average Suicide Rate (2015)', row=1, col=1)
# Scatter plot for HDI
fig.add_trace(go.Scatter(
    x=data['HDI 2015_Female'],
    y=data['2015_Female'],
    mode='markers',
    marker=dict(color='red'),
    name='Female'
), row=1, col=2)
fig.add_trace(go.Scatter(
    x=data['HDI 2015_Male'],
    y=data['2015_Male'],
    mode='markers',
    marker=dict(color='blue'),
    name='Male'
), row=1, col=2)
# Trendline for HDI
hdi_range = np.linspace(data['HDI 2015_Female'].min(), data['HDI 2015_Female'].max(), 100)
fig.add_trace(go.Scatter(
    x=hdi_range,
    y=p_hdi_female(hdi_range),
    mode='lines',
    line=dict(color='red', dash='dash'),
    name='Female Trend'
), row=1, col=2)
fig.add_trace(go.Scatter(
    x=hdi_range,
    y=p_hdi_male(hdi_range),
    mode='lines',
    line=dict(color='blue', dash='dash'),
    name='Male Trend'
), row=1, col=2)
fig.update_xaxes(title_text='HDI 2015', row=1, col=2)
fig.update_yaxes(title_text='Average Suicide Rate (2015)', row=1, col=2)
fig.update_layout(showlegend=True)
# Display the plot
fig.show()
*Figure 3a: Gender Development Index (ca. 2015) on the x-axis. Average suicide rate (2015) per 100k people on the y-axis. Blue dots represent male suicides; red dots represent female suicides. The blue least-squares trendline shows a clear positive correlation between GDI and male suicide. The red least-squares trendline shows a small negative correlation between GDI and female suicide.
*Figure 3b: Human Development Index (ca. 2015) on the x-axis. Average suicide rate (2015) per 100k people on the y-axis. Blue dots represent male suicides; red dots represent female suicides. The blue least-squares trendline shows a very small positive correlation between HDI and male suicide. The red least-squares trendline shows a small negative correlation between HDI and female suicide.
From figure 3a it can be concluded that the GDI and male suicides are positively correlated, while the GDI and female suicides are negatively correlated. In other words, the higher the GDI, the higher the male suicide rate. A possible explanation is that gender roles tend to change quickly in countries with a high GDI. More women are entering the workforce, pursuing higher education and gaining financial independence. At the same time, there is increased recognition of the importance of men taking on what used to be seen as ‘women’s roles’, such as caregiving, emotional expression and sharing household responsibilities. This rapid change can make men feel less important and can cause confusion, frustration and an identity crisis. That can contribute to more depressive and suicidal thoughts, which would explain the increase in male suicide. The same argument explains the negative correlation between the GDI and female suicides. Because of the shift in gender roles, women find themselves in the opposite position: new opportunities are opening up for them, which can help them set goals in life. This in turn can help them find meaning in life and improve their mental health, which can lower the suicide rate.
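The direction of each trendline in figure 3 comes from the slope of a least-squares line fitted with `np.polyfit`. The sketch below shows how to read that slope, using made-up points: the `gdi`, `male_rate` and `female_rate` arrays are hypothetical and are not values from our data.

```python
import numpy as np

# Made-up points (hypothetical, not from our data)
gdi = np.array([0.90, 0.94, 0.97, 1.00])
male_rate = np.array([10.0, 14.0, 17.0, 20.0])
female_rate = np.array([6.0, 5.5, 5.0, 4.5])

# np.polyfit(x, y, 1) returns [slope, intercept] of the least-squares line
slope_m, intercept_m = np.polyfit(gdi, male_rate, 1)
slope_f, intercept_f = np.polyfit(gdi, female_rate, 1)

# A positive slope means the rate rises with the GDI, a negative slope means it falls
print(f"male slope: {slope_m:.1f}, female slope: {slope_f:.1f}")
```

The slope only gives the direction and steepness of the fitted line. How closely the points follow that line is a separate question, which a correlation coefficient would answer.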
Show code cell source
import pandas as pd
import plotly.graph_objects as go
# Read the dataset
males = pd.read_csv('databases/IV DATASET 3.csv')
females = pd.read_csv('databases/IV DATASET 2.csv')
males['HDI Classification'] = males['HDI 2015'].apply(lambda x: 'HD' if x >= 0.85 else 'LD')
females['HDI Classification'] = females['HDI 2015'].apply(lambda x: 'HD' if x >= 0.85 else 'LD')
males['GDI Classification'] = males['GDI 2015'].apply(lambda x: 'HD' if x >= 0.97 else 'LD')
females['GDI Classification'] = females['GDI 2015'].apply(lambda x: 'HD' if x >= 0.97 else 'LD')
# Group the data by HDI and GDI classifications for males
hh_df_m = males[(males['HDI Classification'] == 'HD') & (males['GDI Classification'] == 'HD')]
hl_df_m = males[(males['HDI Classification'] == 'HD') & (males['GDI Classification'] == 'LD')]
lh_df_m = males[(males['HDI Classification'] == 'LD') & (males['GDI Classification'] == 'HD')]
ll_df_m = males[(males['HDI Classification'] == 'LD') & (males['GDI Classification'] == 'LD')]
# Group the data by HDI and GDI classifications for females
hh_df_f = females[(females['HDI Classification'] == 'HD') & (females['GDI Classification'] == 'HD')]
hl_df_f = females[(females['HDI Classification'] == 'HD') & (females['GDI Classification'] == 'LD')]
lh_df_f = females[(females['HDI Classification'] == 'LD') & (females['GDI Classification'] == 'HD')]
ll_df_f = females[(females['HDI Classification'] == 'LD') & (females['GDI Classification'] == 'LD')]
# Extract the necessary columns for plotting
years = ['2000', '2010', '2015']
male_hh = hh_df_m[years].mean()
male_hl = hl_df_m[years].mean()
male_lh = lh_df_m[years].mean()
male_ll = ll_df_m[years].mean()
female_hh = hh_df_f[years].mean()
female_hl = hl_df_f[years].mean()
female_lh = lh_df_f[years].mean()
female_ll = ll_df_f[years].mean()
# Define color and marker styles
colors = ['dodgerblue', 'salmon', 'limegreen', 'purple', 'blue', 'red', 'green', 'orange']
markers = ['circle', 'square', 'diamond', 'triangle-up', 'circle', 'square', 'diamond', 'triangle-up']
labels = ['Male HD-HD', 'Male HD-LD', 'Male LD-HD', 'Male LD-LD', 'Female HD-HD', 'Female HD-LD', 'Female LD-HD', 'Female LD-LD']
# Create the line plot
fig = go.Figure()
# Add traces for each category
for i, data in enumerate([male_hh, male_hl, male_lh, male_ll, female_hh, female_hl, female_lh, female_ll]):
    fig.add_trace(go.Scatter(
        x=years,
        y=data,
        mode='lines+markers',
        name=labels[i],
        marker=dict(color=colors[i], symbol=markers[i], size=8),
        line=dict(width=2)
    ))
# Set plot title and labels
fig.update_layout(
    title='Suicide Development for Male and Female for HD and LD countries',
    xaxis=dict(title='Year'),
    yaxis=dict(title='Suicide Rates'),
    legend=dict(font=dict(size=7), orientation='h', yanchor='top', xanchor='right', x=1, y=1),
    showlegend=True,
    template='plotly_white'
)
# Display the plot
fig.show()
*Figure 4: The years 2000, 2010 and 2015 on the x-axis. Average suicide rate per 100k people on the y-axis. The figure shows the development of male and female suicides over the years. Lines are labelled in this format: Sex - HDI development status - GDI development status. HD stands for Highly Developed countries (>= 0.85 for HDI, >= 0.97 for GDI); LD stands for Lowly Developed countries (< 0.85 for HDI, < 0.97 for GDI).
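The eight lines in figure 4 are group averages over the four HDI/GDI combinations per gender. As a compact sketch of the same idea, the classification and averaging can be expressed with `np.where` and `groupby`. The rows below are made up (countries "A" to "D" and all numbers are hypothetical); only the column names and thresholds mirror the cell above.

```python
import numpy as np
import pandas as pd

# Made-up rows (hypothetical values); column names mirror the cell above
males = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "HDI 2015": [0.90, 0.88, 0.70, 0.60],
    "GDI 2015": [0.98, 0.95, 0.99, 0.90],
    "2000": [20.0, 18.0, 12.0, 10.0],
    "2015": [16.0, 17.0, 11.0, 9.0],
})

# Same thresholds as in the cell above: >= 0.85 for HDI, >= 0.97 for GDI
males["HDI class"] = np.where(males["HDI 2015"] >= 0.85, "HD", "LD")
males["GDI class"] = np.where(males["GDI 2015"] >= 0.97, "HD", "LD")

# One mean per (HDI class, GDI class) combination and year
group_means = males.groupby(["HDI class", "GDI class"])[["2000", "2015"]].mean()
print(group_means)
```

Each row of `group_means` corresponds to one line in the figure for that gender, for example the HD-HD row to the "Male HD-HD" line.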
Explanation and supporting arguments still need to be added here#
POLITICAL PERSPECTIVE#
A short piece of text still needs to be added here#
Show code cell source
import pandas as pd
import numpy as np
import plotly.graph_objects as go
# Read the data into a DataFrame
df = pd.read_csv('databases/MAIN DATASET.csv')
df = df.drop('Unnamed: 0', axis=1)
# Extract relevant columns
government_types = df['government']
hdi_values = df['HDI 2015']
democracy_status = df['democracy']
political_violence = df['political_violence']
# Grouping average HDI and political violence values by government type and democracy status
government_avg_hdi = {}
government_avg_political_violence = {}
government_democracy = {}
for i in range(len(government_types)):
    government_type = government_types[i]
    hdi_value = hdi_values[i]
    is_democracy = democracy_status[i]
    violence_value = political_violence[i]
    if government_type not in government_avg_hdi:
        government_avg_hdi[government_type] = []
    government_avg_hdi[government_type].append(hdi_value)
    if government_type not in government_avg_political_violence:
        government_avg_political_violence[government_type] = []
    government_avg_political_violence[government_type].append(violence_value)
    government_democracy[government_type] = is_democracy
# Calculate average HDI and political violence values for each government type
avg_hdi_values = [np.mean(government_avg_hdi[gt]) for gt in government_avg_hdi]
avg_political_violence_values = [np.mean(government_avg_political_violence[gt]) for gt in government_avg_political_violence]
# Sort the bars based on the average political violence values in ascending order
sorted_indices = np.argsort(avg_political_violence_values)
sorted_avg_hdi_values = [avg_hdi_values[i] for i in sorted_indices]
sorted_avg_political_violence_values = [avg_political_violence_values[i] for i in sorted_indices]
sorted_government_types = [list(government_avg_hdi.keys())[i] for i in sorted_indices]
# Prepare data for plotting
x = np.arange(len(sorted_avg_hdi_values))
# Create the bar trace
bar_trace = go.Bar(
    x=x,
    y=sorted_avg_hdi_values,
    name='Average HDI',
    marker=dict(color='blue')
)
# Create the text annotations for democracy status and political violence
text_annotations = []
for i, gov_type in enumerate(sorted_government_types):
    is_democracy = government_democracy[gov_type]
    if is_democracy == 1.0:
        text_annotations.append(dict(
            x=i,
            y=-0.1,
            text='D',
            showarrow=False,
            font=dict(color='green')
        ))
    else:
        text_annotations.append(dict(
            x=i,
            y=-0.1,
            text='ND',
            showarrow=False,
            font=dict(color='red')
        ))
    text_annotations.append(dict(
        x=i,
        y=-0.15,
        text=f'{round(sorted_avg_political_violence_values[i], 1)}',
        showarrow=False,
        font=dict(color='black')
    ))
# Create the trendline trace
trendline_trace = go.Scatter(
    x=x,
    y=np.poly1d(np.polyfit(x, sorted_avg_hdi_values, 1))(x),
    mode='lines',
    name='Trendline',
    line=dict(color='red', dash='dash')
)
# Set layout
layout = go.Layout(
    title='Average HDI by Government / Political Violence',
    xaxis=dict(
        tickvals=x,
        ticktext=sorted_government_types,
        tickangle=-45,
        tickfont=dict(size=10),
        showticklabels=True
    ),
    yaxis=dict(title='Average HDI'),
    annotations=text_annotations,
    showlegend=True,
    legend=dict(font=dict(size=10), x=1, y=1, bgcolor='rgba(0, 0, 0, 0)'),
    template='plotly_white'
)
# Create the figure
fig = go.Figure(data=[bar_trace, trendline_trace], layout=layout)
# Display the plot
fig.show()
*Figure 4: Bar chart for 14 different government types. Government types are on the x-axis, together with the average political violence score (the higher the score, the more politically violent the government type) and a democracy indicator (D for democracy, ND for non-democracy). Average HDI is on the y-axis. Government types are sorted from left to right by political violence score (low to high). The trendline suggests a negative correlation between political violence and HDI (lower political violence, higher HDI).
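The per-government averages behind this bar chart are collected with a manual loop in the cell above. As an alternative sketch, the same kind of averaging and sorting can be written with pandas `groupby`. The rows below are made up: the government type names and all numbers are hypothetical, and only the column names mirror `MAIN DATASET.csv`.

```python
import pandas as pd

# Made-up rows (hypothetical values); column names mirror 'MAIN DATASET.csv'
df = pd.DataFrame({
    "government": ["Parliamentary Democracy", "Parliamentary Democracy",
                   "Military Dictatorship", "Monarchy"],
    "HDI 2015": [0.90, 0.80, 0.50, 0.70],
    "political_violence": [1.0, 2.0, 4.0, 3.0],
})

# Average HDI and political violence per government type,
# sorted from least to most political violence
summary = (
    df.groupby("government")[["HDI 2015", "political_violence"]]
      .mean()
      .sort_values("political_violence")
)
print(summary)
```

The index of `summary` then gives the bar order from left to right, and its `HDI 2015` column gives the bar heights.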
A short piece of text about this figure still needs to be added#
Show code cell source
import pandas as pd
import plotly.graph_objects as go
# Read the dataset
df = pd.read_csv('databases/MAIN DATASET.csv')
df = df.drop('Unnamed: 0', axis=1)
# Filter the relevant columns
df_democracy = df[df['government'].str.contains('Democracy')]
df_non_democracy = df[~df['government'].str.contains('Democracy')]
# Calculate mean values for each category
categories = ['Average Suicide Rate', 'Political Violence', 'HDI']
democracy_data = [df_democracy['Average suicide 2015'].mean(), df_democracy['political_violence'].mean(), df_democracy['HDI 2015'].mean()]
non_democracy_data = [df_non_democracy['Average suicide 2015'].mean(), df_non_democracy['political_violence'].mean(), df_non_democracy['HDI 2015'].mean()]
# Set up colors and styles
bar_colors = ['#008FD5', '#FF2700']
bar_width = 0.35
bar_positions = list(range(len(categories)))
# Create bar traces
democracy_trace = go.Bar(
    x=categories,
    y=democracy_data,
    name='Democracy',
    marker=dict(color=bar_colors[0]),
    width=bar_width,
    opacity=0.8,
    showlegend=True
)
non_democracy_trace = go.Bar(
    x=categories,
    y=non_democracy_data,
    name='Non-Democracy',
    marker=dict(color=bar_colors[1]),
    width=bar_width,
    opacity=0.8,
    showlegend=True
)
# Create layout
layout = go.Layout(
    title='Comparison of Democracy and Non-Democracy',
    xaxis=dict(title='Categories'),
    yaxis=dict(title='Mean Value'),
    barmode='group',
    showlegend=True,
    legend=dict(x=1, y=1, bgcolor='rgba(0, 0, 0, 0)'),
    template='plotly_white'
)
# Create the figure
fig = go.Figure(data=[democracy_trace, non_democracy_trace], layout=layout)
# Display the plot
fig.show()
*Figure 5: Bar chart for democratic and non-democratic countries. Three categories on the x-axis: average suicide rate per 100k people, average political violence and average HDI. The y-axis is a shared scale for those three categories. The figure shows that, on average, democratic countries have more suicides and more political violence, despite having a higher HDI.
Reflection#
On the draft version of the data story, we received some feedback.
At first the code did not run, because we did not import the proper libraries. We quickly solved this by putting the imports and installs at the top of our notebook.
The first plot was not meaningful; it did not explain anything at a glance. We decided to remove the plot entirely and replaced it with a more meaningful graph.
The code inputs were not hidden. We managed to solve this issue with some help from Teaching Assistants during physical classes.
Our points of view were not clear, the plots did not match one specific PoV, and the data story did not read smoothly but rather as unmethodically placed pieces of information. To resolve this, we took a good look at our perspectives and the graphs that belong to them, and then tried to connect all the loose pieces of information into one overarching data story.
During the peer feedback, we received tips from the other two groups.
The graphs were hard to read, and some did not make a lot of sense. The other groups gave examples of what would be better and clearer graphs. We applied this to our graphs by adding colours and better axis names, which, together with the newly added captions, should result in more visually appealing visualizations.
The combination of the Gender Development Index (GDI) and the Gender Inequality Index (GII) made it unclear whether rising values in the graphs were a good thing or a bad thing. We therefore decided to remove the GII entirely and only discuss the GDI in the Gender Perspective. This should make our data story less confusing as a whole, and more ‘simple yet meaningful’.
The data story did not feel like a story, but rather parts put together.
We examined the feedback and worked on these points. We excluded some graphs and adjusted others for more clarity. We also worked on making the perspectives less like separate parts and more like one merged story.
Work Distribution#
For our project, we divided the workload among the three of us, since we do not have a fourth member in our group, and tried to keep the distribution of responsibilities even. Each team member contributed to different sections of the project as follows:
Introduction: Developed by Julian
Julian took the responsibility of crafting a comprehensive and engaging introduction section for the project.
Preprocessing: Managed by Lloyd
Lloyd played a key role in preprocessing the project data, cleaning and transforming it to make it suitable for analysis. He implemented data cleaning techniques, handled missing values, and ensured data integrity.
Perspectives: Divided between Mitch, Julian & Lloyd
We decided that each person would be assigned one perspective to work on. This way the work is divided evenly between the three of us.
Reflection, Work Distribution & Appendix: Handled by Mitch
Mitch took the lead in crafting the reflection section, providing thoughtful insights, and analyzing the outcomes and lessons learned from the project. He also took charge of the work distribution process and the appendix.
By splitting the workload in this manner, we aimed to leverage each team member’s strengths and ensure a balanced contribution from everyone involved. This approach allowed us to efficiently complete the project while maintaining consistency and quality across all sections.
References#
Global Suicide Data. https://www.kaggle.com/datasets/twinkle0705/mental-health-and-suicide-rates?select=Age-standardized+suicide+rates.csv
Global Suicide Data 2000-2019. https://www.kaggle.com/datasets/sandragracenelson/suicide-rate-of-countries-per-every-year
Human Development Report 2015. https://www.kaggle.com/datasets/undp/human-development?select=human_development.csv
Government types of the world. https://www.kaggle.com/datasets/janzasadny/rulers-elections-and-irregular-governance
Appendix#
Generative AI (ChatGPT with GPT 3.5) was used to facilitate the creation of this document, as shown in the table below.
| Reasons of Usage | In which parts? | Which prompts were used? |
|---|---|---|
| Brainstorming multiple perspectives | The entire project framing | “Give examples of perspectives about HDI, GDI and suicide rates per country” |
| Improve writing clarity and enhance readability | All sections | “Edit the following text to make it more clear. Do not alter the meaning.” |
| Merge perspectives | All sections | “Revise the following text to improve readability and flow of the story.” |
| Ensure grammatical accuracy | All sections | “Correct any grammatical errors in the text.” |
| Provide alternative phrasing | Descriptions of the perspectives | “Suggest alternative phrases for better clarity.” |
| Graph ideas | Visualizations | “What kind of graphs are useful for the following keywords.” |
| Graph generating | Visualizations | “How can i use this graph to give a clear understanding about the following subject.” |
| Improving code | Visualizations and Preprocessing | “Make this code more efficient without losing important information.” |
Table 1: Usage of generative AI to facilitate the creation of this document.